Bayes Factors, Relations to Minimum Description Length, and Overlapping Model Classes
Abstract
This article presents a non-technical perspective on two prominent methods for analyzing experimental data in order to select among model classes. Each class consists of model instances; each instance predicts a unique distribution of data outcomes. One method is Bayesian Model Selection (BMS), instantiated with the Bayes factor. The other is based on the Minimum Description Length (MDL) principle, instantiated by a variant of Normalized Maximum Likelihood (NML); the variant, termed NML*, takes prior probabilities into account. The methods are closely related. The Bayes factor is a ratio of two values: V1 for model class M1 and V2 for M2. Each Vj is the sum, over the instances of Mj, of the joint probabilities (prior times likelihood) for the observed data, normalized by the sum of such sums over all possible data outcomes. NML* is qualitatively similar: the value it assigns to each class Mj is the maximum, over the instances of Mj, of the joint probability for the observed data, normalized by the sum of such maxima over all possible data outcomes. The similarity of BMS to NML* is particularly close when the model classes have no overlapping instances, a way of comparing model classes that we advocate generally. These observations and suggestions are illustrated throughout with a simple example borrowed from Heck, Wagenmakers, and Morey (2015), in which the instances predict a binomial distribution of the number of successes in N trials. The model classes posit that the binomial probability of success lies in various regions of the interval [0,1]. We illustrate the theory and the example not with equations but with tables coupled with simple arithmetic. Using the binomial example, we compare BMS and NML* for model classes that do and do not overlap, and for priors that are and are not uniform. When the classes do not overlap, BMS and NML* produce qualitatively similar results.
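The verbal recipe above translates directly into a few lines of arithmetic. The following Python sketch is purely illustrative and is not code from the article: the particular model classes (p in [0, 0.5) versus p in [0.5, 1]), the finite grid of instances standing in for each class, the uniform within-class priors, and the observed data (7 successes in 10 trials) are all hypothetical choices made for demonstration.

import math

def binom_pmf(k, n, p):
    # Binomial likelihood of k successes in n trials with success probability p.
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def class_values(n, k_obs, instances, priors):
    # Joint probability of outcome k under one instance = prior weight * likelihood.
    def joint_sum(k):  # BMS: sum of the joints over the instances of the class
        return sum(w * binom_pmf(k, n, p) for p, w in zip(instances, priors))
    def joint_max(k):  # NML*: maximum of the joints over the instances of the class
        return max(w * binom_pmf(k, n, p) for p, w in zip(instances, priors))
    # Each value is the quantity at the observed outcome, normalized by the sum
    # of the same quantity over all possible outcomes k = 0..n.
    bms = joint_sum(k_obs) / sum(joint_sum(k) for k in range(n + 1))
    nml_star = joint_max(k_obs) / sum(joint_max(k) for k in range(n + 1))
    return bms, nml_star

# Hypothetical non-overlapping classes: M1 posits p in [0, 0.5), M2 posits p in [0.5, 1].
# Each class is represented by a finite grid of instances with a uniform prior.
n, k_obs, grid = 10, 7, 50
m1 = [i / (2 * grid) for i in range(grid)]
m2 = [0.5 + i / (2 * grid) for i in range(grid + 1)]
u1 = [1 / len(m1)] * len(m1)
u2 = [1 / len(m2)] * len(m2)

bms1, nml1 = class_values(n, k_obs, m1, u1)
bms2, nml2 = class_values(n, k_obs, m2, u2)
print("Bayes factor (M2 over M1):", bms2 / bms1)
print("NML* ratio (M2 over M1):", nml2 / nml1)

With non-overlapping classes such as these, the two ratios behave qualitatively alike, which is the comparison the article works through with tables and simple arithmetic rather than equations.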